ABSTRACT (continued)

Two  channel,  two  frequency,  versions  of  the  basic  pattern  discrimination 
experiment  were  begun.  In  these  experiments  the  frequency  of  the  tones  in  the  second 
sequence  was  different  from  the  first  (by  more  than  a  critical  band)  and  these  tones  were 
presented  in  the  contralateral  earphone  channel.  Little  decrease  in  pattern  discrimination 
performance  was  observed  compared  to  experiment  1.  Performance  dropped  as  the  time 
separation  between  the  two  sequences  was  reduced.  Performance  was  worst  when  the  two 
sequences  began  to  overlap.  When  there  was  only  a  very  small  time  delay,  t,  separating 
each  tone  of  the  two  sequences,  performance  was  exceedingly  good.  The  latter  result  is 
explained  by  the  presence  of  periodicities  generated  on  trials  when  the  sequences  are 
correlated. As t increases, the frequency of the prominent periodicity becomes too
low  for  the  system  and  observer  processing  shifts  from  a  spectral  to  a  trace-dependent  and 
then  to  a  context-coding  basis.  This  transition  is  very  interesting  because  it  may  provide  a 
bridge  between  elementary  psychophysical  experiments  on  noise  correlation 
discrimination  (or  binaural  detection),  and  results  with  longer  stimuli  such  as  tonal 
sequences. The experimental paradigm may support a model of performance that is
applicable over three different modes of processing (spectral, trace, and context) as a
function of the single task parameter, sequence offset time.

I. Temporal Pattern Discrimination
A.  Introduction 


The goal of this research is to understand how human listeners encode, store, and
compare  the  temporal  patterns  defined  by  two  tonal  sequences.  The  general  experimental 
paradigm  requires  the  listener  to  decide  whether  two  arrhythmic  tonal  sequences  have  the 
same  or  different  temporal  patterns.  This  comparison  process  appears  to  be  accomplished 
in  different  ways,  depending  on  the  relative  timing  of  the  sequences.  In  one  case,  listener 
behavior  is  described  by  a  process  that  computes  the  correlation  between  (encoded  and 
stored)  lists  of  intertone  onset  intervals.  In  the  second  case,  listener  performance  is 
modeled  as  a  running  computation  of  the  correlation  between  the  whole  waveforms  of  the 
filtered  input  sequences.  In  a  later  section,  we  expand  these  descriptions  and  summarize 
some  preliminary  data. 

Several  investigators  have  studied  the  perception  of  partially  unstructured  or 
arrhythmic  temporal  sequences.  Lunney  (1974)  showed  that  the  discrimination  of 
irregularity  in  tempo,  introduced  into  the  fourth  click  of  the  output  of  a  metronome,  was 
an  exponential  function  of  the  period,  in  a  range  of  period  durations  from  30  ms  to  3200 
ms. Pollack studied the perception of temporal gaps within trains of very brief pulses
(Pollack, 1967, 1968a) and the perception of periodicity and jitter in pulse trains
(1968b,c,d). Pollack found that the threshold for gap discrimination increased with the
interpulse interval, for interpulse intervals greater than 10 ms. In general, performance
was best when the pulse trains contained large numbers of intervals and had very short
interpulse intervals. Pollack suggested that the processing of trains with very short
interpulse intervals involved a spectral mode of processing, while long interpulse intervals
(> 10 ms) probably required a temporal processing mode.

Sorkin,  Boggs,  and  Brady  (1982)  studied  the  perception  of  tone  sequences  with 
randomly  jittered  temporal  patterns.  Their  subjects  heard  two  sequences  of  n  tones:  one 
sequence  had  a  fixed  intertone  interval  and  the  other  had  jitter  added  to  the  intertone 
intervals.  Subjects  had  to  detect  which  sequence  had  the  added  jitter.  Sorkin  et  al.  found 
that  discrimination  improved  with  the  number  of  intervals  and  decreased  with  the  average 
duration  of  the  intervals  (the  durations  ranged  from  20  to  110  ms).  Their  results  were 
consistent  with  temporal  discrimination  data  employing  single,  marked  time  intervals 
(Creelman,  1962;  Getty,  1975;  Divenyi  and  Danner,  1977;  Divenyi  and  Sachs,  1978;  and 
Allen,  1979). 

Sorkin  et  al.  (1982)  proposed  a  statistical  model  of  jitter  detection,  in  which  the  timing 
of  different  frequency  tones  was  monitored  (and  compared)  across  separate  critical  band 
channels;  discrimination  of  time  jitter  within  a  critical  band  channel  was  much  better  than 
across  channels.  Performance  increased  in  the  expected  way  with  the  number  of  tones  in 
each  sequence  and  with  the  different  regular  frequency  patterns  employed.  However, 
when  the  frequency  patterns  were  random,  listener  performance  was  well  below  the 
model’s  predictions. 

In a similar experiment, Halpern and Darwin (1982) presented subjects with a
sequence  of  four  clicks  which  marked  three  intervals;  their  subjects  had  to  indicate 
whether  the  last  interval  was  shorter  or  longer  than  the  preceding  two.  Halpern  and 
Darwin  tested  base  durations  ranging  from  400  to  1450  ms.  Discrimination  performance, 
as  measured  by  the  standard  deviation  of  the  resulting  psychometric  functions,  was  an 
increasing  function  of  the  base  duration;  the  resulting  Weber  fraction  was  about  0.05, 
consistent with that reported by Getty (1975) and Sorkin et al. (1982).

Recently, Schulze (1989) reported a variation of the Halpern and Darwin experiment
in which subjects were asked to report whether the last of n intervals, marked by tones, was
longer than or the same as the n-1 preceding intervals. Schulze used base durations from 50
to 400 ms and from 2 to 6 intervals in each sequence. Schulze tested a hypothesis similar
to that of the Sorkin et al. model about the expected improvement in discriminability with
the number of intervals. For most of the subjects, discrimination improved with the number
of intervals. Schulze failed to find evidence for a Weber’s law effect; for his subjects, the
discrimination limen was between 5 and 15 ms and independent of the base duration.

B.  Single  Channel  Sequence  Discrimination 

We have completed experiments (Sorkin, submitted) in which the listener was asked to
compare two arrhythmic tonal sequences and report whether the temporal patterns were
the same or different. The two sequences were either identical or had partially correlated
temporal envelopes. This task is a generalization of the Sorkin et al. (1982) jitter
detection paradigm. An advantage of these paradigms is that the information-carrying
aspects of the sequences are distributed throughout the sequence, rather than concentrated
on one judged interval as in the Halpern and Darwin (1982) and Schulze (1989)
experiments. The goal of our experiments was to test whether a listener’s ability to
perform sequence comparison can be described by a process in which the listener
computes the correlation between the sequence temporal envelopes.

In these sequence discrimination tasks, listeners compared two tone sequences,
each composed of n 1000-Hz tone bursts of 35 ms duration at a sound pressure level of
approximately 71 dB. Tone bursts were shaped by a 4 ms linear rise and decay envelope.
After listening to the pair of tone sequences presented on each trial, the subject had to
respond whether the temporal pattern of tones was the same or different. There
were two types of experimental trials: trials on which the identical sequence of tones and
intertone intervals (gaps) was presented twice (SAME trials), and trials on which the pattern of
intertone gaps was different in the two presented sequences (DIFFERENT trials). On
trials when the sequences were different, the only difference between the sequences was in
the pattern of intertone gaps and tone onsets. The first part of figure 1 illustrates a SAME
trial; the second part illustrates a DIFFERENT trial. The type of trial was chosen at
random, with p(SAME) = .5.


Figure 1. The envelopes of typical tone sequences are shown for (A) SAME and (B) DIFFERENT trials.

The intertone gaps were generated by a process that enabled us to control the mean
gap duration, $\mu_{gap}$, the standard deviation of the gaps, $\sigma_{gap}$, and the correlation, $\rho_{ex}$,
between the two gap sequences on trials when the sequences were different. The
intertone gaps were constructed by combining three independently generated normal
deviates, with one deviate common to the two sequences. Gap durations of less than 2 ms
were not allowed. The sequence correlation is given by the ratio of two variances, the
variance common to the two sequences divided by the sum of the common and unique
variances (Jeffress and Robinson, 1962):


$$\rho_{ex} = \frac{\sigma_{com}^2}{\sigma_{com}^2 + \sigma_{un}^2} \qquad (1)$$

and

$$\sigma_{gap}^2 = \sigma_{com}^2 + \sigma_{un}^2 \qquad (2)$$

where com and un refer, respectively, to the common and unique portions.
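
The gap-generation procedure can be sketched in Python. This is a minimal illustration under stated assumptions, not the program used in the experiments: the function name `make_gap_pair`, the choice to redraw a pair when either gap falls below 2 ms, and the example parameter values are all hypothetical. The component standard deviations follow from equations (1) and (2): the common variance is the specified correlation times the gap variance, and the unique variance is the remainder.

```python
import math
import random

def make_gap_pair(n, mean_gap, sd_gap, rho_ex, min_gap=2.0, rng=None):
    """Return two lists of n intertone gaps (ms) whose population correlation is rho_ex."""
    rng = rng or random.Random()
    sd_com = sd_gap * math.sqrt(rho_ex)        # common variance = rho_ex * sd_gap**2
    sd_un = sd_gap * math.sqrt(1.0 - rho_ex)   # unique variance = (1 - rho_ex) * sd_gap**2
    seq1, seq2 = [], []
    while len(seq1) < n:
        common = rng.gauss(0.0, sd_com)                  # deviate shared by both sequences
        g1 = mean_gap + common + rng.gauss(0.0, sd_un)   # unique deviate, sequence 1
        g2 = mean_gap + common + rng.gauss(0.0, sd_un)   # unique deviate, sequence 2
        # Gaps under 2 ms are not allowed; redrawing the pair is an assumed way to enforce this.
        if g1 >= min_gap and g2 >= min_gap:
            seq1.append(g1)
            seq2.append(g2)
    return seq1, seq2

# Hypothetical DIFFERENT trial: 12 gaps, mean 50 ms, SD 20 ms, correlation 0.5.
gaps_a, gaps_b = make_gap_pair(12, 50.0, 20.0, 0.5, rng=random.Random(1))
```

Discarding short gaps truncates the normal distributions slightly, so the realized mean, standard deviation, and correlation will differ a little from the nominal values when the mean gap is small relative to its standard deviation.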


Discrimination 


A  simple  model  of  observer  performance  in  the  temporal  pattern  discrimination  task 
follows  from  the  assumption  that  the  observer  computes  the  correlation  between  the  two 
sequences  of  gaps  (or  tone  onsets)  presented  on  each  trial.  Suppose  that  the  observer’s 
response  is  based  on  the  value  of  the  Pearson  product-moment  correlation  coefficient 
statistic, $r_{12}$, computed on the sample of intertone gaps defined by the pair of sequences,

$\langle t_{1,1}, t_{1,2}, \ldots, t_{1,n} \rangle$ and $\langle t_{2,1}, t_{2,2}, \ldots, t_{2,n} \rangle$.

A  transformation  of  the  correlation  coefficient,  known  as  the  Fisher  r  to  Z  transformation, 
is  defined  as: 


$$Z = \frac{1}{2}\ln\left[\frac{1 + r_{12}}{1 - r_{12}}\right] \qquad (3)$$
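
As a concrete illustration of the decision statistic, the following Python sketch computes the Pearson correlation of two gap lists and applies the Fisher r to Z transformation. The function names and the example gap values are hypothetical and are not taken from the experiments.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher r to Z transformation: Z = 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

# Hypothetical gap lists (ms) for the two sequences on one trial.
seq1 = [42.0, 61.0, 35.0, 78.0, 50.0]
seq2 = [48.0, 55.0, 30.0, 70.0, 66.0]
r12 = pearson_r(seq1, seq2)
z = fisher_z(r12)
```

The transformation is undefined at r = 1, which is one reason an internal-noise term is needed before SAME trials can be assigned a finite Z.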


The sampling distribution of Z is approximately normal for gaps drawn
from a normal distribution and for n of at least moderate size (n ~ 10). If $\rho$ is the
population correlation coefficient, the mean and standard deviation of Z are then given by
(Brunk, 1960):


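$$\mu_Z = \frac{1}{2}\ln\left[\frac{1 + \rho}{1 - \rho}\right] + \frac{\rho}{2n - 1} \qquad (4)$$

and

$$\sigma_Z = (n - 3)^{-1/2} \qquad (5)$$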


Discrimination performance can be obtained from the normalized difference between
the means of the Z statistic, given the possible hypotheses on a trial: SAME, when $\rho$ = 1.0,
and DIFFERENT, when $\rho$ = $\rho_{ex}$. The discriminability, d', is given by the difference
between the means of the Z statistic divided by the standard deviation of Z. (The
contribution of the right-hand term of equation 4 is very small.)


For a human observer, the effective correlation between the sequences on
DIFFERENT trials will depend on $\rho_{ex}$, $\sigma_{gap}$, and the magnitude of internal variability in
the observer’s encoding and storage of the gaps. We assume that the observer’s
observation of the gaps is subject to a temporal jitter, $\sigma_{in}$, and that this jitter is
uncorrelated across the gap sequences. Adding the variance of this uncorrelated jitter,
$\sigma_{in}^2$, to equations (1) and (2) yields:


$$\rho_{DIFF} = \frac{\sigma_{com}^2}{\sigma_{com}^2 + \sigma_{un}^2 + \sigma_{in}^2} \qquad (6)$$


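and from equations (1) and (2) and $\rho$ = 1.0 (so that the unique variance is zero and the
common variance equals the gap variance), the effective correlation on SAME trials,

$$\rho_{SAME} = \frac{\sigma_{gap}^2}{\sigma_{gap}^2 + \sigma_{in}^2} \qquad (7)$$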


The magnitude of the internal temporal jitter, $\sigma_{in}$, is the single parameter of this model.
Because the internal jitter is independent between the two sequences, it acts to reduce the
effective correlation of the sequences.


Discrimination  performance  can  be  calculated  using  equations  (4),  (6)  and  (7)  to 
compute  the  difference  between  the  means  of  the  Z  statistic  on  DIFFERENT  and  SAME 
trials  divided  by  the  standard  deviation  of  Z: 


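$$d' = \frac{\left[\frac{1}{2}\ln\left(\frac{1 + \rho_{SAME}}{1 - \rho_{SAME}}\right) + \frac{\rho_{SAME}}{2n - 1}\right] - \left[\frac{1}{2}\ln\left(\frac{1 + \rho_{DIFF}}{1 - \rho_{DIFF}}\right) + \frac{\rho_{DIFF}}{2n - 1}\right]}{(n - 3)^{-1/2}} \qquad (8)$$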
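
The predicted discriminability can be evaluated numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the SAME-trial correlation is taken to be the gap variance divided by the sum of the gap and internal-jitter variances (which follows from setting the unique variance to zero), the small correction term in the mean of Z is written as $\rho/(2n-1)$, and n = 10 is an arbitrary example value.

```python
import math

def effective_correlations(rho_ex, sd_gap, sd_in):
    """Effective correlations on SAME and DIFFERENT trials after adding internal jitter."""
    var_gap, var_in = sd_gap ** 2, sd_in ** 2
    var_com = rho_ex * var_gap                  # from equations (1) and (2)
    rho_diff = var_com / (var_gap + var_in)     # equation (6), with var_com + var_un = var_gap
    rho_same = var_gap / (var_gap + var_in)     # SAME trials: var_un = 0, var_com = var_gap
    return rho_same, rho_diff

def mean_z(rho, n):
    """Mean of Fisher's Z for population correlation rho, as in equation (4)."""
    return 0.5 * math.log((1.0 + rho) / (1.0 - rho)) + rho / (2 * n - 1)

def d_prime(rho_ex, sd_gap, sd_in, n):
    """Difference between the SAME and DIFFERENT means of Z, in units of the SD of Z."""
    rho_same, rho_diff = effective_correlations(rho_ex, sd_gap, sd_in)
    sd_z = (n - 3) ** -0.5                      # equation (5)
    return (mean_z(rho_same, n) - mean_z(rho_diff, n)) / sd_z

# Example: gap SD of 20 ms, internal jitter of 14.75 ms, n = 10 (n is a hypothetical value).
for rho in (0.0, 0.25, 0.5, 0.75):
    print(rho, round(d_prime(rho, 20.0, 14.75, 10), 2))
```

Under these assumptions the predicted d' falls as the sequence correlation rises and grows as the gap standard deviation increases relative to the internal jitter.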


Effect  of  Sequence  Correlation  and  Variability 

We examined how discrimination performance depended on the correlation between
the sequences, $\rho_{ex}$ (as specified on DIFFERENT trials, since $\rho$ = 1 on SAME trials), and
the standard deviation of the intertone gaps, $\sigma_{gap}$, and we estimated the magnitude of the
internal noise, $\sigma_{in}$.

Figure 2 shows the data from four observers at a mean gap duration of 50 ms and a gap
standard deviation of 20 ms. The vertical bars in the figures indicate plus and minus one
standard error of the mean. The solid lines in figure 2 are least-squares fits of the model to
each observer’s average data; the value of the internal jitter parameter is shown in each
section of the figure. The observed drop in performance with increases in the correlation
of the sequences is consistent with the model. The value of the (single) internal temporal
jitter parameter was 14.75 ms, for the fit of the model to the average data from the four
listeners. This value for internal jitter is at the high end of the range of values obtained in
duration discrimination experiments employing single and multiple judged intervals
(Lunney, 1974; Getty, 1975; Divenyi and Danner, 1977; Halpern and Darwin, 1982; Sorkin,
Boggs, and Brady, 1982; and Schulze, 1989).
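
A one-parameter least-squares fit of this kind can be sketched as a simple grid search. The code below is an assumed procedure, not necessarily the fitting method used for figure 2; the function names, the grid limits, and n = 10 are hypothetical, and the "observed" values in the usage example are synthetic numbers generated from the model itself, not data from the experiments.

```python
import math

def model_d_prime(rho_ex, sd_gap, sd_in, n):
    """Correlation-model d' for a given internal jitter (compact form of the model)."""
    denom = sd_gap ** 2 + sd_in ** 2
    rho_same = sd_gap ** 2 / denom
    rho_diff = rho_ex * sd_gap ** 2 / denom

    def mean_z(rho):
        return math.atanh(rho) + rho / (2 * n - 1)

    return (mean_z(rho_same) - mean_z(rho_diff)) * math.sqrt(n - 3)

def fit_internal_jitter(rho_values, observed_d, sd_gap, n, lo=1.0, hi=60.0, step=0.05):
    """Return the internal jitter (ms) on a grid that minimizes the squared error in d'."""
    best_sd, best_err = None, float("inf")
    for i in range(int(round((hi - lo) / step)) + 1):
        sd_in = lo + i * step
        err = sum((model_d_prime(r, sd_gap, sd_in, n) - d) ** 2
                  for r, d in zip(rho_values, observed_d))
        if err < best_err:
            best_sd, best_err = sd_in, err
    return best_sd

# Synthetic check: generate d' values from the model at 14.75 ms and recover the parameter.
rhos = [0.0, 0.25, 0.5, 0.75]
synthetic_d = [model_d_prime(r, 20.0, 14.75, 10) for r in rhos]
fitted = fit_internal_jitter(rhos, synthetic_d, 20.0, 10)
```

Because the predicted d' decreases monotonically with internal jitter at every correlation, the squared error has a single minimum and a coarse grid is sufficient for a sketch like this.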

Figure 2. Performance (d') is plotted as a function of the sequence correlation, for each of
four observers. The solid lines show the performance of the correlation model with the
internal noise standard deviation shown.


Figure  3  shows  how  average  performance  depended  on  the  standard  deviation  of  the 
gap  duration.  The  vertical  bars  indicate  plus  and  minus  one  standard  error  of  the  mean; 
the  average  standard  errors  for  the  four  observers  are  shown  for  each  condition.  The  solid 
line  is  the  prediction  of  the  correlation  model,  using  the  value  of  the  internal  jitter  based 
on  the  average  data  of  Figure  2.  As  the  level  of  external  variability  in  the  gaps  increases, 
the  contribution  of  internal  and  (assumed)  uncorrelated  variability  is  reduced,  and 
performance  should  improve.  It  is  apparent  that  the  model  overestimates  performance  at 
high  gap  standard  deviations. 


Figure  3.  The  average  performance  of  four  observers  (d’)  is  plotted  as  a  function  of  the 
standard  deviation  of  the  gaps.  The  solid  line  is  the  prediction  of  the  correlation  model 
with  an  internal  noise  of  14.75  ms. 

We also examined how discrimination performance depended on the mean gap
duration, $\mu_{gap}$, and on the number of intertone gaps, n. As the mean gap was increased,
observer performance decreased at an increasing rate. The model, as defined by
equations (6), (7), and (8), was modified to incorporate this effect by assuming a Weber’s
Law type of dependence of the internal jitter on the magnitude of $\mu_{gap}$. Such a
relationship, where $\sigma_{in}$ increases in proportion to $\mu_{gap}$, has been found by Lunney (1974),
Getty (1975), Divenyi and Danner (1977), Halpern and Darwin (1982), and Sorkin, Boggs,
and Brady (1982).
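
One way to write this modification, assuming strict proportionality with a constant k (a hypothetical symbol; no value for it is stated here), is to substitute the jitter into the effective correlations of equations (6) and (7):

$$\sigma_{in} = k\,\mu_{gap}$$

$$\rho_{SAME} = \frac{\sigma_{gap}^2}{\sigma_{gap}^2 + k^2\mu_{gap}^2}, \qquad \rho_{DIFF} = \frac{\rho_{ex}\,\sigma_{gap}^2}{\sigma_{gap}^2 + k^2\mu_{gap}^2}$$

The second expression uses the identity $\sigma_{com}^2 = \rho_{ex}\,\sigma_{gap}^2$ from equations (1) and (2). With the gap standard deviation held fixed, both effective correlations fall as the mean gap grows, so the predicted d' declines with increasing mean gap duration.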

The  modified  model  did  well  at  predicting  performance  as  a  function  of  the  mean 
gap;  however,  the  model’s  decrease  in  performance  with  increasing  gap  size  was  less  than 
that  shown  by  the  human  observers  at  mean  gaps  of  80  ms  or  more.  Some  part  of  the  drop 
at  long  gaps  may  be  attributable  to  the  fact  that  spans  of  1  s  or  longer  exceed  the  capacity 
of  the  observer’s  auditory  memory  and  hence  the  effective  number  of  intervals  being 
processed  is  much  smaller  than  assumed  by  the  model  (see  Watson,  1987).  Similar  effects 
occurred when the number of intertone gaps was manipulated. Good fits with the model’s
predictions were obtained as long as the number of intertone gaps did not exceed 12.


These experiments support the idea that the discrimination of temporally perturbed
tone sequences may be described as a process in which the listener computes the
correlation between the temporal envelopes of the sequences. This computation appears
to be limited by an internal temporal variability (of approximately 15 ms) in the listener’s
encoding and storage of the stimulus information. This variability is about 10 ms higher
than difference thresholds obtained using two-interval duration discrimination tasks,
depends on the magnitude of the base duration to be discriminated, and increases when
the time span of the sequences is longer than 1 s or when the sequences have more than
12 intervals. These latter effects probably are related to encoding and memory
limitations.

C. Extensions to the Theory and Ongoing/Planned Experiments

In  this  section  we  discuss  some  implications  of  the  above  results  for  correlation 
theories  of  pattern  discrimination.  We  argue  that  different  realizations  of  the  correlation 
mechanism  may  hold  under  different  task  conditions.  We  describe  some  experiments  to 
specify  the  nature  of  the  mechanism  under  those  conditions. 

Possible  Mechanisms 

The idea that a listener can compare auditory patterns by computing the correlation
between temporal or spectral aspects of the patterns is not novel. Models of the binaural
detection mechanism have typically assumed a process that computes the interaural
correlation between the left and right auditory channels
(Durlach,  1963;  Osman,  1971;  Lindemann,  1986;  and  cf.  Sorkin,  1965,  and  Pohlmann  and 
Sorkin,  1974).  Several  investigators  have  studied  the  binaural  discrimination  of  changes  in 
the  interaural  whole-waveform  correlation  of  the  signals  (e.g.  for  wideband  noise:  Pollack 
and  Trittipoe,  1959;  for  pulse  train  polarity  agreement:  Pollack,  1971;  and  for  wideband, 
narrowband,  and  low-pass  noise:  Gabriel  and  Colburn,  1981).  These  studies  have 
reported  a  dependence  of  discrimination  on  interaural  correlation  that  is  consistent  with 
the  hypothesized  correlation  process. 

Recently,  Richards  (1987)  reported  an  experiment  on  the  discrimination  of  differences 
between  simultaneously  presented  noise  stimuli  having  partially  correlated  amplitude  (and 
spectral)  envelopes.  Richards  postulated  a  correlation  discrimination  process  that  is 
essentially  identical  to  the  one  we  have  proposed  to  describe  sequence  comparison.  Her 
noise  stimuli  had  bandwidths  of  100  Hz  and  center  frequencies  of  2500  and  2750  Hz.  For 
any  given  stimulus,  these  two  noise  bands  had,  on  average,  a  specified  correlation.  The 
observers had to discriminate which of two such stimuli contained the higher correlation
across  the  spectral  bands.  Richards  tested  her  observers’  ability  to  discriminate  between  a 
reference  stimulus,  containing  either  a  zero  or  unit  noise  correlation,  and  target  stimuli 
having  a  range  of  noise  correlations.  In  general,  her  results  supported  the  model:  the 
observers’  sensitivity  to  changes  in  envelope  correlation  was  a  monotonic  function  of  the 
computed  z  statistic  and  was  essentially  independent  of  the  specific  reference  correlation. 

In the binaural studies and in Richards’s noise study, one assumes that the listener
can  compute  the  correlation  between  the  transduced,  critical-band  filtered  signals;  the 
signals  are  assumed  to  undergo  minimal  processing  prior  to  the  correlation  operation.  It 
is  possible  that  a  similar  process  is  operating  in  the  sequence  discrimination  task:  The 
signals  in  each  sequence  are  transduced,  subjected  to  windowing  and  filtering  operations, 
and  then  stored;  finally,  the  correlation  is  computed  between  the  resulting  waveforms. 

A  more  cognitive  mechanism  may  be  appropriate  for  describing  the  listener’s 
correlation  computation  in  the  sequence  discrimination  task.  Using  this  mechanism,  the 
listener (behaves as though he/she) encodes and stores only the magnitudes of the time
intervals  between  the  tone  onsets.  The  listener  then  computes  the  correlation  between  the 
resulting  two  lists  of  interonset  times.  This  view  of  the  correlation  process  implies  quite 
different  relationships  between  the  task  characteristics  and  performance.  For  example, 
the  computation  of  correlation  based  on  two  lists  of  stored  numbers  should  be  relatively 
insensitive  to  certain  transformations  of  the  sequences  such  as  temporal  compression  or 
expansion.  This  is  in  contrast  to  the  whole-waveform  correlation  mechanism,  which  might 
be  expected  to  be  highly  sensitive  to  such  transformations.  This  distinction  between  an 
input  or  waveform-based  process,  and  a  more  highly  processed  mode  is  similar  to  the  trace 
and  context  processing  modes  postulated  by  Durlach  and  Braida  and  their  colleagues,  and 
discussed  in  a  number  of  studies  (see  Durlach  and  Braida,  1969;  also  cf.  Sorkin,  1987). 


Figure  4.  Average  performance  plotted  as  a  function  of  the  time  between  the  onset  of 
each  sequence  in  the  two  channel  experiment.  (Average  sequence  duration  =  630  ms.) 

Two Channel Sequence Discrimination

We are attempting to distinguish between these alternative mechanisms in ongoing
experiments. The discrimination task was modified so that the two sequences were
presented at different frequencies (all of the tones in sequence 1 were at 1000
Hz; all tones in sequence 2 were at 2300 Hz) and to
different earphone channels. When the sequences were presented dichotically and at
different frequencies, essentially the same performance was obtained as in the previous
sequence experiments (see Figure 4). The point plotted in figure 4 at a sequence offset of
1375 ms shows that performance in the dichotic, separated-frequency condition was
similar to performance when the frequencies were the same and presented in the same
ear. As the time separation between the two sequences was reduced, performance fell.

As  the  sequences  began  to  overlap  in  time,  performance  dropped  markedly  until  at 
approximately  50%  overlap,  performance  was  lowest.  These  results  indicate  that  listeners 
can  process  sequence  temporal  pattern  information  across  two  frequency  (and  ear) 
channels  at  time  separations  and  performance  levels  similar  to  those  when  the  frequencies 
and  channels  were  identical. 

Notice that subjects could perform the two-channel discrimination task extremely
well when the two sequences were presented at the same time, so long as the sequences
were not offset by more than approximately 20 ms. Without invoking specific binaural
mechanisms, it is clear that a great deal of spectral information about the relative
similarity of the patterns is potentially available to the listener when the sequence offset is
less than 20 ms. Suppose that the offset were 5 ms and that both channels contained the
SAME sequence pattern: all tones would result in cross-channel pairings having a 5 ms
separation. If the channels contained DIFFERENT temporal patterns, there would be a
distribution of cross-channel pairings, with a reduction in the prominent periodicity (at
1/offset), depending on the correlation between the channels. As the offset is
increased from 0 to 10 ms and greater, the peak in the (same sequence) cross-channel
spectrum would decrease to 100 Hz and below, eventually reaching a point where spectral
processing is not feasible. As the offset is increased beyond that point, the alternative
"temporal" processing mode would be invoked and performance would be similar to that
observed in the long delay, single channel, sequence experiment.
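
The cross-channel pairing argument can be made concrete with a short Python sketch. The function names and the onset times below are hypothetical; the only relation taken from the text is that a constant cross-channel delay produces a prominent periodicity at 1/offset.

```python
def prominent_periodicity_hz(offset_ms):
    """Frequency (Hz) of the periodicity produced by a constant cross-channel delay."""
    return 1000.0 / offset_ms

def cross_channel_delays(onsets_1, onsets_2, offset_ms):
    """Delay (ms) from each tone in channel 1 to its paired tone in channel 2."""
    return [(t2 + offset_ms) - t1 for t1, t2 in zip(onsets_1, onsets_2)]

# Hypothetical tone onset times (ms), measured from the start of each sequence.
same = [0.0, 65.0, 170.0, 250.0]
different = [0.0, 80.0, 150.0, 265.0]

print(cross_channel_delays(same, same, 5.0))        # SAME trial: every delay equals the offset
print(cross_channel_delays(same, different, 5.0))   # DIFFERENT trial: delays are spread out
print(prominent_periodicity_hz(5.0), prominent_periodicity_hz(10.0))
```

On a SAME trial the delays collapse onto a single value, which is what yields one strong periodicity; on a DIFFERENT trial the spread of delays weakens it.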

The  two  channel,  sequence  discrimination  task  is  interesting  because  it  may  provide 
a  bridge  between  psychophysical  experiments  on  correlation  discrimination  with  noise  or 
binaural  stimuli,  and  more  context  sensitive  or  "cognitive"  sequence  experiments.  An 
intermediate  mode  of  processing  may  be  operating  as  well:  As  the  time  between 
sequences  is  increased  beyond  the  limit  of  the  whole  waveform  correlation  mechanism, 
the  listener  may  try  to  (briefly)  store  the  whole  waveforms  for  later  comparison.  Thus,  a 
short  term  memory  requirement  is  imposed  at  offsets  longer  than  20  ms.  As  the  offset 
becomes  very  long,  memory  trace  noise  becomes  excessive,  and  there  is  sufficient  time  for 
the  interonset  intervals  to  be  encoded  and  stored  as  representations  of  the  time  pattern. 
The system then performs its correlation computations in the interonset timing or
"context" mode. Thus, an exciting aspect of the two channel sequence paradigm is that it
may support a model of performance that is applicable over three different modes of
processing (spectral, trace, and context) as a function of the single task parameter, offset
delay.

Temporal  Manipulations 

We have begun to examine performance in single channel, long offset delay,
sequence discrimination experiments in which the second sequence has been scaled in
time (compressed or expanded) by a factor of 0.6 to 1.4. The preliminary results of
these manipulations are consistent with the predictions of the correlation model; the
effect of the time transformation is small and is approximately a symmetric function of the
time scaling factor. These manipulations should have a smaller effect on the interonset
time mode than on the whole-waveform mode. The former mechanism, based on encoded
and  stored  lists  of  numbers,  should  be  relatively  insensitive  to  uniform  scaling  of  the  lists 
(or  any  paired  portion  of  the  list).  The  whole-waveform  mechanism,  however,  should  be 
very  sensitive  to  such  temporal  transformations,  since  the  computed  correlation  (on 
SAME  trials)  will  be  greatly  reduced.  It  would  be  necessary  for  this  mechanism  to 
compute  the  correlation  at  a  number  of  delays  in  order  to  make  a  decision  about  whether 
the  sequences  were  the  same  or  different. 
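
The contrast between the two mechanisms under time scaling can be illustrated with a small Python sketch. It is a simplified demonstration, not a model of the listener: the gap values are hypothetical, the envelopes are rectangular (the 4 ms rise and decay are ignored), only the onset times are scaled while the tone duration stays at 35 ms, and the waveform correlation is computed at a single zero lag.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def onsets_from_gaps(gaps, tone_ms=35.0):
    """Tone onset times (ms) for tones separated by the given silent gaps."""
    onsets = [0.0]
    for g in gaps:
        onsets.append(onsets[-1] + tone_ms + g)
    return onsets

def envelope(onsets, total_ms, tone_ms=35.0):
    """Rectangular on/off envelope sampled once per millisecond."""
    n = int(math.ceil(total_ms))
    env = [0.0] * n
    for on in onsets:
        start = int(round(on))
        for i in range(start, min(n, start + int(tone_ms))):
            env[i] = 1.0
    return env

def compare_modes(gaps, scale):
    """Return (interonset-list correlation, zero-lag envelope correlation) on a SAME trial."""
    onsets_1 = onsets_from_gaps(gaps)
    onsets_2 = [scale * t for t in onsets_1]     # uniform time scaling of the second sequence
    ioi_1 = [b - a for a, b in zip(onsets_1, onsets_1[1:])]
    ioi_2 = [b - a for a, b in zip(onsets_2, onsets_2[1:])]
    total = max(onsets_1[-1], onsets_2[-1]) + 35.0
    r_list = pearson_r(ioi_1, ioi_2)
    r_wave = pearson_r(envelope(onsets_1, total), envelope(onsets_2, total))
    return r_list, r_wave

example_gaps = [30.0, 70.0, 45.0, 60.0, 25.0, 80.0, 50.0, 40.0]  # hypothetical gaps, ms
r_list, r_wave = compare_modes(example_gaps, 1.4)
```

In this example the correlation between the two lists of interonset intervals stays at essentially 1 after scaling, because the Pearson coefficient is unchanged by multiplying one list by a positive constant, whereas the zero-lag correlation between the envelopes drops sharply.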

These  temporal  manipulations  will  be  applied  to  the  two-channel  sequence  task.  In 
the  two  channel  task,  the  compression  or  expansion  will  be  applied  randomly  over  trials, 
limited  to  a  smaller  range  (10-15%),  and  uniformly  applied  over  each  sequence.  We  will 
compare  the  effects  of  the  temporal  scaling  manipulations  on  performance  at  very  brief 
offsets,  at  intermediate  offsets,  and  at  long  offsets.  Our  expectation  is  that,  at  long  offsets, 
the  results  will  be  similar  to  those  described  in  our  preliminary  single  channel 
experiments, supporting the interonset time mechanism. At short offsets, we expect that
performance will be much more sensitive to the scaling manipulation, indicating a whole
waveform, spectral type mechanism.

Frequency  Manipulations 

The listener’s subjective impression of the single-frequency sequence
discrimination task is of trying to recall and compare two briefly heard rhythmic patterns.
That observation, the relatively long interonset intervals employed in the task, and the
small effect of changing the frequency of all of the tones in the second sequence support
the argument that the listener is using a temporal rather than spectral processing mode.
We  have  characterized  this  mode  as  the  interonset  timing  mode.  Because  this  mode 
requires  encoding  and  storage  of  the  timing  information,  it  is  likely  that  it  will  be 
dependent  on  contextual  factors  such  as  the  nature  and  distribution  of  different  frequency 
tones  in  the  sequences  and  within-  and  across-trial  variation  in  the  frequency  of  the  tones. 

The  literature  on  the  perception  and  production  of  temporal  patterns  includes 
many  studies  that  demonstrate  the  influence  of  sequence  temporal  structure  on  spectral 
pattern discrimination (Deutsch, 1980; Jones, 1981; Jones, Kidd, and Wetzel, 1981; Jones,
Boltz, and Kidd, 1982; and Monahan, 1987) as well as the influence of sequence
spectral pattern on temporal pattern discrimination (Woods, Sorkin, and Boggs, 1979;
Handel and Lawson, 1983; Espinoza-Varas and Jamieson, 1984; Espinoza-Varas and
Watson, 1986; and Sorkin, 1987). The model of temporal jitter detection supported by
Sorkin  et  al.  (1982)  assumed  that  best  performance  would  occur  when  the  tones  marking 
the  intervals  were  within  a  critical  band  in  frequency.  In  that  experiment,  the  detection  of 
jitter  in  sequences  containing  different  frequency  tones  was  predictably  poorer  than  with 
equitone  sequences.  A  similar  assumption  may  enable  the  correlation  model  to  describe 
pattern  comparisons  between  multiple  frequency  tone  sequences.  For  example,  the 
listener  might  compute  the  correlation  between  the  temporal  envelopes  of  tone 
subsequences  defined  only  within  a  single  critical  band.  Correlations  computed  within 
separate  critical  bands  then  could  be  combined,  in  order  to  arrive  at  a  composite  estimate 
of  the  temporal  similarity  of  the  sequences. 
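
As a rough illustration of the band-wise scheme suggested above, the following Python sketch builds an on/off envelope for the tones falling in each critical band, correlates the two sequences' envelopes band by band, and averages the per-band correlations. The rectangular envelope, the 25-ms tone duration, the integer band labels, and the simple averaging rule are all assumptions made for this example; the report does not specify how the per-band correlations would be combined.

```python
import math

def envelope(onsets_ms, tone_ms, total_ms):
    """Rectangular on/off envelope sampled once per millisecond (assumed shape)."""
    env = [0.0] * total_ms
    for onset in onsets_ms:
        for t in range(onset, min(onset + tone_ms, total_ms)):
            env[t] = 1.0
    return env

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def bandwise_similarity(seq1, seq2, tone_ms=25, total_ms=600):
    """Each sequence is a list of (onset_ms, band) pairs.

    Correlate the envelopes within each band, then average across bands
    (the averaging rule is a hypothetical combination rule).
    """
    bands = sorted({b for _, b in seq1} | {b for _, b in seq2})
    correlations = []
    for band in bands:
        e1 = envelope([t for t, b in seq1 if b == band], tone_ms, total_ms)
        e2 = envelope([t for t, b in seq2 if b == band], tone_ms, total_ms)
        correlations.append(pearson(e1, e2))
    return sum(correlations) / len(correlations)

# Hypothetical two-band sequences: identical timing vs. jitter in band 1 only.
standard = [(0, 0), (100, 1), (250, 0), (400, 1)]
jittered = [(0, 0), (130, 1), (250, 0), (370, 1)]
same_score = bandwise_similarity(standard, standard)
diff_score = bandwise_similarity(standard, jittered)
```

In this toy case the identical pair yields a composite correlation of essentially 1, while jitter confined to one band lowers the composite even though the other band still matches perfectly.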

We  have  indicated  that  changing  the  frequency  of  all  tones  in  the  second  sequence 
does  not  degrade  performance  when  the  offset  is  either  very  short  or  very  long.  However, 
this  manipulation  did  not  involve  the  presence  of  uncertainty  about  the  frequency  of  the 
tones  within  a  sequence  (or  pair  of  sequences).  If  both  sequences  on  a  trial  have  the  same 
pattern  of  tone  frequency,  we  should  be  able  to  predict  performance  based  on  the  results 
from  single  frequency  sequences  at  those  frequencies.  Relative,  rather  than  absolute, 
timing  accuracy  should  be  important  in  the  interonset  timing  mode;  a  reduction  in  timing 
accuracy  due  to  timing  intervals  across  critical  bands  should  not  produce  large  effects  on 
performance.  Such  a  manipulation  should  not  affect  the  whole  waveform  mode  because 
only  the  envelope  information  is  relevant  to  the  computation.  Similarly,  the  effect  of  a 
random  frequency  condition,  so  long  as  the  same  random  sequence  pattern  occurs  in  each 
sequence,  should  be  small. 

The  preceding  predictions  apply  to  cases  when  the  frequency  pattern  of  each 
sequence  on  a  trial  is  the  same  (or  when  there  is  no  frequency  uncertainty  over  trials). 

The  effect  of  frequency  pattern  uncertainty  within  an  experimental  trial  is  potentially 
more  complex.  How  well  can  a  listener  discriminate  between  two  temporal  patterns  on  a 
trial,  when  the  frequency  patterns  of  the  sequences  vary  within  the  trial?  We  would  expect 
this  manipulation  to  have  a  small  effect  on  processing  in  the  whole  waveform  mode  and  a 
large  effect  on  the  interonset  timing  mode.  The  effect  on  the  whole  waveform  mode 
would  be  minimal  for  the  reasons  cited  in  the  previous  paragraph.  However,  this  type  of 
contextual  uncertainty  should  interact  with  the  encoding  and  storing  operations  required 
by  the  interonset  timing  mode.  The  goal  of  the  experiments  is  to  evaluate  these  effects 
over  a  range  of  timing  manipulations  and  to  incorporate  the  results  into  a  general  model 
of  sequence  pattern  discrimination. 

II.  Information  Processing  with  Multi-Element  Sources 
(This  project  was  partially  supported  by  the  Naval  Weapons  Center,  China  Lake,  CA.) 

A  detection  theory  theorem  (Starr  et  al.,  Radiology,  1975,  116,  533)  predicts 
performance  in  a  recognition-detection  task  (subject  responds  whether  signal  a  or  b  or  c 
or  no  signal  was  present)  from  data  obtained  in  the  component  detection  tasks  (respond 
whether  signal  a  or  no  signal  was  present).  The  theorem  was  evaluated  in  a  visual  display 
processing  task  employing  nine-element  linear  arrays  of  analogue  gauges.  The  subject's 
task  was  to  decide  which  signal  had  occurred  on  each  trial.  The  gauge  values  were 
generated  by  statistical  processes  having  different  mean  values  depending  on  whether  a 
signal  or  noise  was  present;  the  signals  were  defined  by  different  patterns  of  mean  gauge 
values.  The  signals  were  designed  to  be  equally  detectable  and  mathematically 
orthogonal.  The  subjects  detected  each  of  the  three  signals  separately,  as  well  as  all 
combinations  of  the  three  signals.  The  Starr  Theorem  provided  good  predictions  of 
performance  in  the  recognition-detection  tasks  based  on  performance  in  the  component 
detection  tasks.  This  work  was  reported  at  the  annual  meeting  of  the  Human  Factors 
Society,  in  Denver,  October,  1989,  and  in  Elvers  (1989). 

A  second  study,  using  the  same  types  of  stimuli,  but  in  a  single-signal  detection 
version  only,  was  run  under  conditions  in  which  the  statistical  properties  of  the  display 
elements  were  non-uniform.  That  is,  the  means  of  the  display  elements,  given  signal  and 
noise,  varied  depending  on  spatial  position;  thus  the  diagnosticities  of  the  display  elements 
varied.  Performance  in  this  task  was  analyzed  using  a  theorem  derived  by  Dr.  Bruce  Berg 
for  computing  the  observer  decision  weights  in  a  task  involving  the  detection  of  auditory 
sequences  (Dr.  Berg  is  in  Dr.  David  Green’s  laboratory  at  Florida).  The  results  of  these 
experiments  indicate  that  it  is  very  difficult  for  the  observer  to  employ  optimal  weights  in 
the  processing  of  display  information  having  variable  diagnosticity  (reported  in  Elvers, 
1989). 
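
To make the notion of optimal weights concrete, the sketch below computes the standard ideal-observer benchmark for a linear combination of independent Gaussian display elements: the optimal weight on each element is proportional to its mean signal-noise difference divided by its variance. This is not Dr. Berg's theorem, which estimates the weights an observer actually used; it is only the reference against which such estimates can be compared. The Gaussian assumption, the equal variances, and the particular mean differences are hypothetical values chosen for illustration.

```python
import math

def ideal_weights(delta_mu, sigma):
    """Optimal linear weights for independent Gaussian elements: delta_mu / sigma^2."""
    return [d / (s * s) for d, s in zip(delta_mu, sigma)]

def d_prime(weights, delta_mu, sigma):
    """Detectability of the weighted sum of the elements."""
    mean_shift = sum(w * d for w, d in zip(weights, delta_mu))
    spread = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, sigma)))
    return mean_shift / spread

# Hypothetical nine-element display: the central gauges are the most diagnostic.
delta_mu = [0.2, 0.4, 0.6, 0.8, 1.0, 0.8, 0.6, 0.4, 0.2]
sigma = [1.0] * 9

d_opt = d_prime(ideal_weights(delta_mu, sigma), delta_mu, sigma)
d_uniform = d_prime([1.0] * 9, delta_mu, sigma)

# Efficiency of equal weighting relative to the ideal observer.
efficiency = (d_uniform / d_opt) ** 2
```

With these illustrative values, an observer who weights all nine gauges equally achieves a lower d' than the ideal observer, which is the kind of shortfall the weight analysis is designed to expose.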


III.  Other  Activities 

Member,  National  Research  Council  Committee  on  Hearing,  Bioacoustics,  and 
Biomechanics  (CHABA). 

Panelist,  CHABA  Working  Group  on  Classification  of  Complex,  Non-Speech  Sounds. 

Associate  Editor,  International  Journal  on  Human-Computer  Interaction. 


IV.  Project  Personnel 

Elvers,  G.  C.,  Assistant  in  Psychology.  Department  of  Psychology,  University  of  Florida. 
While  at  Florida,  Dr.  Elvers  completed  his  Ph.D.  dissertation  and  received  his 
Doctor  of  Philosophy  degree  from  Purdue  University.  He  left  the  project  to  take  a 
position  at  the  University  of  Dayton. 

Pezzo,  M.,  Graduate  Student.  Department  of  Psychology,  University  of  Florida.  Mr. 
Pezzo  left  the  University  of  Florida  in  August,  1989,  to  continue  his  graduate 
studies  at  Ohio  University. 

Widman,  D.,  Graduate  Student.  Department  of  Psychology,  University  of  Florida.  Ms. 
Widman  began  work  on  the  project  in  May,  1989. 

Sorkin,  R.  D.,  Professor  and  Chair.  Department  of  Psychology,  University  of  Florida. 


V.  Publications,  Reports,  Manuscripts,  Dissertations 

Elvers,  G.  C.  (1989).  Detection  of  visual  signals  consisting  of  multiple  information  sources: 
A  signal  detection  analysis.  Unpublished  doctoral  dissertation,  Purdue  University, 
West  Lafayette,  IN. 

Sorkin,  R.  D.,  Kantowitz,  B.  H.,  and  Kantowitz,  S.  C.  Likelihood  Alarm  Displays.  Human 
Factors,  1988,  30,  445-459. 

Sorkin,  R.  D.  Review  of  M.  Loeb,  "Noise  and  Human  Efficiency".  American  Journal  of 
Psychology,  1988,  101,  290-293. 

Sorkin,  R.  D.  Why  are  people  turning  off  our  alarms?  Journal  of  the  Acoustical  Society  of 
America,  1988,  84,  1107-1108.  Reprinted  in  Human  Factors  Society  Bulletin,  1989, 
22,  3-4. 

Makhoul,  J.,  Crystal,  T.H.,  Green,  D.  M.,  Hogan,  D.,  McAulay,  R.J.,  Pisoni,  D.B.,  Sorkin, 
R.D.,  and  Stockham,  T.G.,Jr.  Removal  of  Noise  From  Noise-Degraded  Speech 
Signals.  Committee  on  Hearing,  Bioacoustics,  and  Biomechanics,  National 
Research  Council,  National  Academy  Press,  Washington,  DC,  1989. 

Yost,  W.  A.,  Braida,  L.  D.,  Hartmann,  W.  M.,  Kidd,  G.  D.  Jr.,  Kruskal,  J.  B.,  Pastore,  R. 
E.,  Sachs,  M.  B.,  Sorkin,  R.  D.,  Warren,  R.  M.  Classification  of  Complex 
Nonspeech  Sounds.  Committee  on  Hearing,  Bioacoustics,  and  Biomechanics, 
National  Research  Council,  National  Academy  Press,  Washington,  DC,  1989. 

Sorkin,  R.  D.  and  Elvers,  G.  C.  Analysis  of  Automated  Decision  Systems.  Final  Report  of 
Naval  Weapons  Center  Contract  N60530-88-C-0213,  June  1989. 

Sorkin,  R.  D.,  Wightman,  F.  L.,  Kistler,  D.  J.,  and  Elvers,  G.  C.  An  exploratory  study  of  the 
use  of  movement-correlated  cues  in  an  auditory  head-up  display.  Human  Factors, 
1989,  31,  161-166. 

Elvers,  G.  C.  and  Sorkin,  R.  D.  Detection  and  recognition  of  multiple-element  visual 
displays.  Proceedings  of  the  Human  Factors  Society,  1989  (in  press). 

Sorkin,  R.  D.,  Mabry,  T.  R.,  Weldon,  M.,  and  Elvers,  G.  Integration  of  information  from 
multiple  element  displays.  Organizational  Behavior  and  Human  Decision  Processes 
(manuscript  submitted). 

Sorkin,  R.  D.  Perception  of  temporal  patterns  defined  by  tonal  sequences.  Journal  of  the 
Acoustical  Society  of  America  (manuscript  submitted). 

Barfield,  W.,  Salvendy,  G.,  and  Sorkin,  R.  D.  Judgments  on  the  angular  orientation  of 
three-dimensional  (3D)  images  displayed  in  virtual  3D  space.  Ergonomics 
(manuscript  submitted). 




